Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima
Recently, a race towards the simplification of deep networks has begun,
showing that it is effectively possible to reduce the size of these models with
minimal or no performance loss. However, there is a general lack of
understanding of why these pruning strategies are effective. In this work, we
compare and analyze pruned solutions obtained with two different pruning
approaches, one-shot and gradual, showing the higher effectiveness of the
latter. In particular, we find that gradual pruning allows access to narrow,
well-generalizing minima, which are typically ignored when using one-shot
approaches. We also propose PSP-entropy, a measure of
how strongly a given neuron correlates with specific learned classes. Interestingly,
we observe that the features extracted by iteratively-pruned models are less
correlated to specific classes, potentially making these models a better fit for
transfer learning approaches.
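As an illustration of the two schedules compared in this abstract, here is a minimal Python sketch. It assumes a magnitude-based pruning criterion, which the abstract does not specify. The names `magnitude_mask`, `one_shot_prune`, `gradual_prune` and the `retrain` callback are hypothetical placeholders for whatever criterion and fine-tuning procedure are actually used.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Boolean mask (True = keep) that drops the smallest-magnitude entries.

    ASSUMPTION: magnitude is used as the pruning criterion.
    """
    k = int(round(sparsity * weights.size))  # number of weights to remove
    mask = np.ones(weights.size, dtype=bool)
    if k > 0:
        order = np.argsort(np.abs(weights), axis=None)  # ascending magnitude
        mask[order[:k]] = False
    return mask.reshape(weights.shape)

def one_shot_prune(weights, sparsity):
    """Remove the full fraction of weights in a single step."""
    return weights * magnitude_mask(weights, sparsity)

def gradual_prune(weights, sparsity, steps, retrain=lambda w, m: w):
    """Reach the same sparsity in `steps` increments, retraining in between.

    `retrain(w, mask)` is a hypothetical callback that fine-tunes the
    surviving weights; the default leaves them unchanged.
    """
    w = weights.copy()
    for step in range(1, steps + 1):
        current = sparsity * step / steps   # linearly increasing sparsity
        mask = magnitude_mask(w, current)   # already-zeroed weights go first
        w = retrain(w * mask, mask) * mask  # keep pruned weights at zero
    return w
```

With the default (identity) `retrain`, both schedules select the same weights; any difference between them comes from the retraining performed between pruning steps.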
From Statistical Physics to Algorithms in Deep Neural Systems
The abstract is in the attachment.
Learning Sparse Neural Networks via Sensitivity-Driven Regularization
The ever-increasing number of parameters in deep neural networks poses
challenges for memory-limited applications. Regularize-and-prune methods aim at
meeting these challenges by sparsifying the network weights. In this context we
quantify the output sensitivity to the parameters (i.e. their relevance to the
network output) and introduce a regularization term that gradually lowers the
absolute value of parameters with low sensitivity. Thus, a very large fraction
of the parameters approach zero and are eventually set to zero by simple
thresholding. Our method surpasses most of the recent techniques both in terms
of sparsity and error rates. In some cases, the method reaches twice the
sparsity obtained by other techniques at equal error rates.
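The shrink-then-threshold idea described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the sensitivities are taken as given values in [0, 1], the decay factor `1 - sensitivity` is an assumed form of the regularization term, the task-loss gradient is omitted, and the names `decay_step`, `threshold` and `lam` are hypothetical.

```python
import numpy as np

def decay_step(w, sensitivity, lam=0.1):
    """One regularization step: shrink weights with low sensitivity.

    ASSUMPTION: the shrinkage is proportional to (1 - sensitivity),
    clipped to [0, 1]; fully sensitive weights are left untouched.
    """
    insensitivity = np.clip(1.0 - sensitivity, 0.0, 1.0)
    return w - lam * w * insensitivity

def threshold(w, t):
    """Set to zero every weight whose absolute value is below t."""
    return np.where(np.abs(w) < t, 0.0, w)

# Toy usage: three weights with high, zero, and medium sensitivity.
w = np.array([1.0, 1.0, -1.0])
sensitivity = np.array([1.0, 0.0, 0.5])
for _ in range(50):
    w = decay_step(w, sensitivity)
pruned = threshold(w, t=0.01)
```

In this toy run the weight with zero sensitivity decays below the threshold and is removed, while the other two survive.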
Can we avoid Double Descent in Deep Neural Networks?
Finding the optimal size of deep learning models is a timely problem of broad
impact, especially in energy-saving schemes. Very recently, an unexpected
phenomenon, the ``double descent'', has caught the attention of the deep
learning community. As the model's size grows, the performance first gets
worse and then starts improving again. It raises serious questions about the
optimal model's size to maintain high generalization: the model needs to be
sufficiently over-parametrized, but adding too many parameters wastes training
resources. Is it possible to find, in an efficient way, the best trade-off? Our
work shows that the double descent phenomenon is potentially avoidable with
proper conditioning of the learning problem, but a final answer is yet to be
found. We empirically observe that there is hope of dodging the double descent in
complex scenarios with proper regularization, as a simple
regularization already contributes positively in this direction.
EnD: Entangling and Disentangling deep representations for bias correction
Artificial neural networks achieve state-of-the-art performance in an
ever-growing number of tasks and are now used to solve a very wide variety of
problems. However, issues such as the presence of biases in the training data
call into question the generalization capability of these models. In this work we
propose EnD, a regularization strategy whose aim is to prevent deep models from
learning unwanted biases. In particular, we insert an "information bottleneck"
at a certain point of the deep neural network, where we disentangle the
information about the bias, while still letting the information useful for the
training task propagate forward through the rest of the model. One big advantage
of EnD is that it does not require additional training complexity (like decoders
or extra layers in the model), since it is a regularizer directly applied on
the trained model. Our experiments show that EnD effectively improves the
generalization on unbiased test sets, and it can be effectively applied to
real-world scenarios, such as removing hidden biases in COVID-19 detection from
radiographic images.
On the Role of Structured Pruning for Neural Network Compression
LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks
LOBSTER (LOss-Based SensiTivity rEgulaRization) is a method for training
neural networks having a sparse topology. Let the sensitivity of a network
parameter be the variation of the loss function with respect to the variation
of the parameter. Parameters with low sensitivity, i.e. having little impact on
the loss when perturbed, are shrunk and then pruned to sparsify the network.
Our method allows training a network from scratch, i.e. without preliminary
learning or rewinding. Experiments on multiple architectures and datasets show
competitive compression ratios with minimal computational overhead.
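A minimal sketch of loss-based sensitivity regularization on a toy linear regression problem is given below. It is an illustration under assumptions, not the authors' implementation: the sensitivity is taken as the absolute loss gradient, the shrinkage factor `max(0, 1 - |gradient|)` is an assumed form, and the function name `train_loss_sensitivity`, the hyperparameters and the toy data are all hypothetical.

```python
import numpy as np

def train_loss_sensitivity(X, y, lr=0.1, lam=0.01, steps=500, t=0.1, seed=0):
    """Train a linear model from scratch, shrinking low-sensitivity weights.

    ASSUMPTIONS: loss = 0.5 * mean squared error; sensitivity = |dL/dw|;
    shrinkage is scaled by max(0, 1 - sensitivity).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=0.1, size=d)            # random init, no pre-training
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n             # dL/dw for 0.5 * MSE
        insensitivity = np.maximum(0.0, 1.0 - np.abs(grad))
        w = w - lr * grad - lam * w * insensitivity  # descend, then shrink
    return np.where(np.abs(w) < t, 0.0, w)       # prune by thresholding

# Toy usage: only features 0 and 3 influence the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
w_true = np.array([2.0, 0.0, 0.0, -3.0, 0.0])
y = X @ w_true
w_pruned = train_loss_sensitivity(X, y)
```

In this toy setting the weights attached to irrelevant features have little impact on the loss, so they are shrunk and end up below the pruning threshold, while the two relevant weights are kept (slightly shrunk by the regularizer).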